User centric approach to itemset utility mining in Market Basket Analysis
Author
Abstract
Business intelligence is information about a company's past performance that is used to help predict its future performance. It can reveal emerging trends from which the company might profit [31]. Data mining allows users to sift through the enormous amount of information available in data warehouses, and it is from this sifting process that business intelligence gems may be found [31]. Within data mining, the problem of deriving associations from data, often referred to as the “market-basket problem”, has received a great deal of attention. Association Rule Mining (ARM), a well-studied data mining technique, identifies frequent itemsets in databases and generates association rules on the assumption that all items have the same significance and frequency of occurrence in a record. In many real applications, such as retail marketing and nutritional pattern mining, items actually differ in many respects [26]. Rare items are items that occur less frequently [32]. For many real-world applications, however, the utility of rare itemsets, measured by cost, profit, or revenue, is important. When extracting rare itemsets, equal-frequency-based approaches such as Apriori suffer from the “rare item problem”. Utility mining aims at identifying rare itemsets with high utility. The main objective of utility mining is to identify the itemsets with the highest utilities by considering profit, quantity, cost, or other user preferences [40]. In addition, valuable patterns cannot be discovered by traditional non-temporal data mining approaches, which treat all the data as one large segment and pay no attention to the time information of transactions. As increasingly complex real-world problems are addressed, the temporal rare itemset utility problem is taking center stage. In many real-life applications, high-utility itemsets consist of rare items.
Rare itemsets provide useful information in decision-making domains such as business transactions, medicine, security, fraudulent transactions, and retail communities. For example, in a supermarket, customers purchase microwave ovens or frying pans rarely compared with bread, washing powder, or soap, but the former purchases yield more profit for the supermarket. A retail business may be interested in identifying its most valuable customers, i.e. those who contribute a major fraction of overall company profit [40]. In this paper, these problems of analyzing market-basket data are considered and important contributions are presented. It is assumed that the utilities of itemsets may differ, and high-utility itemsets are determined based on both internal (transaction) and external utilities.
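The distinction between frequency and utility can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes the common formulation in which the internal utility of an item is its purchased quantity in a transaction and the external utility is its unit profit, and it enumerates all itemsets by brute force. The item names, profit values, transactions, and the function names `itemset_utility`, `support`, and `high_utility_itemsets` are all hypothetical.

```python
from itertools import combinations

# Hypothetical external utilities (profit per unit); values are illustrative only.
profit = {"bread": 1, "soap": 2, "microwave": 40, "frying_pan": 15}

# Hypothetical transactions: item -> purchased quantity (internal utility).
transactions = [
    {"bread": 2, "soap": 1},
    {"bread": 1, "microwave": 1},
    {"bread": 3, "soap": 2},
    {"microwave": 1, "frying_pan": 1},
    {"bread": 1},
]


def support(itemset, transactions):
    """Number of transactions that contain every item of the itemset."""
    return sum(all(i in t for i in itemset) for t in transactions)


def itemset_utility(itemset, transactions, profit):
    """Sum of quantity * unit profit over transactions containing the itemset."""
    total = 0
    for t in transactions:
        if all(i in t for i in itemset):
            total += sum(t[i] * profit[i] for i in itemset)
    return total


def high_utility_itemsets(transactions, profit, min_utility):
    """Brute-force enumeration; exponential in the number of distinct items."""
    items = sorted({i for t in transactions for i in t})
    result = {}
    for k in range(1, len(items) + 1):
        for combo in combinations(items, k):
            u = itemset_utility(combo, transactions, profit)
            if u >= min_utility:
                result[combo] = u
    return result


print(support(("bread",), transactions), itemset_utility(("bread",), transactions, profit))
print(support(("microwave",), transactions), itemset_utility(("microwave",), transactions, profit))
print(high_utility_itemsets(transactions, profit, min_utility=40))
```

In this toy data, bread is the most frequent item but has low utility, while the rarely bought microwave dominates the high-utility result, which mirrors the supermarket example above. Because utility is not anti-monotone (a superset can have higher utility than its subsets), Apriori-style pruning on utility alone is not valid, so practical algorithms rely on other bounds instead of the exhaustive search used here.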
Related resources
Efficient Algorithms for Mining of High Utility Itemsets
The utility of an itemset represents its importance, which can be measured in terms of weight, value, quantity or other information depending on the user specification. High utility itemsets mining identifies itemsets whose utility satisfies a given threshold. It allows users to quantify the usefulness or preferences of items using different values. Thus, it reflects the impact of different i...
A New Algorithm for High Average-utility Itemset Mining
High utility itemset mining (HUIM) is an emerging field in data mining which has gained growing interest due to its various applications. The goal of this problem is to discover all itemsets whose utility exceeds a minimum threshold. The basic HUIM problem does not consider the length of itemsets in its utility measurement, and utility values tend to become higher for itemsets containing more items...
Study on High Utility Itemset Mining
Data mining is the process of mining new, non-trivial, and potentially valuable information from large databases. Data mining has been used in the analysis of customer transactions in retail research, where it is termed market basket analysis. Earlier data mining methods concentrated more on the correlation between the items that occur more frequently in transactions. In frequent itemset mini...
Factorizing Sequential and Historical Purchase Data for Basket Recommendation
Basket recommendation is an important task in market basket analysis. Existing work on this problem can be summarized into two paradigms. One is the item-centric paradigm, where sequential patterns are mined from users’ transactional data and leveraged for prediction. However, these approaches usually suffer from the data sparseness problem. The other is the user-centric paradigm, where collabo...
A Survey on Infrequent Weighted Itemset Mining Approaches
Association Rule Mining (ARM) is one of the most popular data mining techniques. All existing work is based on frequent itemsets. Frequent itemsets find application in a number of real-life contexts, e.g., market basket analysis, medical image processing, and biological data analysis. In recent years, the attention of researchers has been focused on infrequent itemset mining. This paper tackles the issue...